Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules
Abstract
We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation. This model allows us to generate new molecules for efficient exploration and optimization through open-ended spaces of chemical compounds. A deep neural network was trained on hundreds of thousands of existing chemical structures to construct three coupled functions: an encoder, a decoder, and a predictor. The encoder converts the discrete representation of a molecule into a real-valued continuous vector, and the decoder converts these continuous vectors back to discrete molecular representations. The predictor estimates chemical properties from the latent continuous vector representation of the molecule. Continuous representations of molecules allow us to automatically generate novel chemical structures by performing simple operations in the latent space, such as decoding random vectors, perturbing known chemical structures, or interpolating between molecules. Continuous representations also allow the use of powerful gradient-based optimization to efficiently guide the search for optimized functional compounds. We demonstrate our method in the domain of drug-like molecules and also in a set of molecules with fewer than nine heavy atoms.
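The latent-space operations named in the abstract (perturbing a known point, interpolating between two points, and gradient-based optimization against a predictor) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the encoder and decoder are omitted, latent vectors are plain lists of floats, and `toy_predictor` with its `TARGET` peak is a hypothetical concave quadratic standing in for the trained property predictor, not the network described above.

```python
import random

def interpolate(z_a, z_b, steps):
    """Return `steps` evenly spaced latent points from z_a to z_b, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [
        [a + (b - a) * t / (steps - 1) for a, b in zip(z_a, z_b)]
        for t in range(steps)
    ]

def perturb(z, scale, rng):
    """Add independent Gaussian noise with standard deviation `scale` to z."""
    return [x + rng.gauss(0.0, scale) for x in z]

# Hypothetical stand-in for the trained predictor: a concave quadratic
# whose maximum sits at TARGET. A real predictor would be a neural network.
TARGET = [1.0, -2.0, 0.5]

def toy_predictor(z):
    """Toy property score; larger is better, maximal (0.0) at TARGET."""
    return -sum((x - t) ** 2 for x, t in zip(z, TARGET))

def toy_predictor_grad(z):
    """Analytic gradient of toy_predictor with respect to z."""
    return [-2.0 * (x - t) for x, t in zip(z, TARGET)]

def gradient_ascent(z, lr=0.1, n_steps=100):
    """Move z uphill on the toy predictor with fixed-step gradient ascent."""
    z = list(z)
    for _ in range(n_steps):
        grad = toy_predictor_grad(z)
        z = [x + lr * g for x, g in zip(z, grad)]
    return z

if __name__ == "__main__":
    rng = random.Random(0)
    z_start = [0.0, 0.0, 0.0]
    print(interpolate(z_start, TARGET, 5))
    print(perturb(z_start, 0.1, rng))
    print(gradient_ascent(z_start))
```

In the setting the abstract describes, each resulting latent vector would then be passed through the decoder to obtain a discrete molecular representation; that step is left out here because it depends on the trained model.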
Volume: 4
Publication date: 2018